Finding the Position of the k-Mismatch and Approximate Tandem Repeats
Authors
Abstract
Given a pattern P, a text T, and an integer k, we want to find, for every position j of T, the index of the k-mismatch of P with the suffix of T starting at position j. We give an algorithm that finds the exact index for each j, and algorithms that approximate it. We use these algorithms to get an efficient solution for an approximate version of the tandem repeats problem with k-mismatches.
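To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm from the paper, only a quadratic-time baseline. It assumes that "the k-mismatch" means the k-th position (counted from the start of P) at which P and the suffix of T disagree. The function name `kth_mismatch_positions`, the 0-based indexing, and the choice to return `None` when fewer than k mismatches occur within the overlap are all illustrative assumptions.

```python
from typing import List, Optional


def kth_mismatch_positions(pattern: str, text: str, k: int) -> List[Optional[int]]:
    """Naive baseline (not the paper's algorithm).

    For every start position j in text, return the 0-based index i in
    pattern where the k-th mismatch between pattern and text[j:] occurs.
    Assumption: if the suffix is shorter than the pattern, only the
    overlapping part is compared; None means fewer than k mismatches.
    """
    result: List[Optional[int]] = []
    for j in range(len(text)):
        mismatches = 0
        found: Optional[int] = None
        overlap = min(len(pattern), len(text) - j)
        for i in range(overlap):
            if pattern[i] != text[j + i]:
                mismatches += 1
                if mismatches == k:
                    found = i
                    break
        result.append(found)
    return result


if __name__ == "__main__":
    print(kth_mismatch_positions("abab", "abbbab", 1))
    print(kth_mismatch_positions("abab", "abbbab", 2))
```

This baseline takes time proportional to |P| times |T| in the worst case, which is the cost the exact and approximate algorithms mentioned in the abstract aim to improve on.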
Similar resources
ProRepeat: an integrated repository for studying amino acid tandem repeats in proteins
ProRepeat (http://prorepeat.bioinformatics.nl/) is an integrated curated repository and analysis platform for in-depth research on the biological characteristics of amino acid tandem repeats. ProRepeat collects repeats from all proteins included in the UniProt knowledgebase, together with 85 completely sequenced eukaryotic proteomes contained within the RefSeq collection. It contains non-redund...
An Algorithm for Approximate Tandem Repeats
A perfect single tandem repeat is defined as a nonempty string that can be divided into two identical substrings, e.g., abcabc. An approximate single tandem repeat is one in which the substrings are similar, but not identical, e.g., abcdaacd. In this paper we consider two criteria of similarity: the Hamming distance (k mismatches) and the edit distance (k differences). For a string S of lengt...
Finding Approximate Tandem Repeats with the Burrows-Wheeler Transform
Approximate tandem repeats in a genomic sequence are two or more contiguous, similar copies of a pattern of nucleotides. They are used in DNA mapping, studying molecular evolution mechanisms, forensic analysis and research in diagnosis of inherited diseases. All their functions are still investigated and not well defined, but increasing biological databases together with tools for identificatio...
Tandem repeats: two different approaches
where the pattern composed of the three bases CGG in the left sequence is transformed into five identical and adjacent copies. Additional mutational laws can cause mismatches between the bases of the pattern and the bases of the repeated copies; for this reason there is sometimes no exact repetition of the pattern but rather an approximate repetition in which every copy of the pattern is qui...
Segmental Duplications as a Complement Strategy to Short Tandem Repeats in the Prenatal Diagnosis of Down Syndrome
Background: Quantitative fluorescence-polymerase chain reaction (QF-PCR) is an inexpensive and accurate method for the prenatal diagnosis of aneuploidies that applies short tandem repeats (STRs) as a chromosome-specific marker. Despite its apparent advantages, QF-PCR is not applicable in all cases due to the presence of uninformative STRs. This study was carried out to investigate the efficienc...